Search CORE

355 research outputs found

Half-Duplex Relaying for the Multiuser Channel

Author: Lei Ming
Soleymani Mohammad Reza
Publication date: 14/01/2014

This work studies half-duplex (HD) relaying in the Multiple Access Relay Channel (MARC) and the Compound Multiple Access Channel with a Relay (cMACr). A generalized Quantize-and-Forward (GQF) scheme is proposed to establish the achievable rate regions. The scheme is a variation of the Quantize-and-Forward (QF) scheme and uses a single-block, two-slot coding structure. The results can also be considered a significant extension of the achievable rate region of the Half-Duplex Relay Channel (HDRC). Furthermore, the rate regions based on the GQF scheme are extended to the Gaussian channel case. The performance of the scheme is illustrated through numerical examples.
Comment: 7 pages, 4 figures, conference paper

arXiv.org e-Print Archive

Crossref

Automatic tagging and geotagging in video collections and communities

Author: Jones Gareth J.F.
Larson Martha
Serdyukov Pavel
Soleymani Mohammad
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/04/2011

Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition, and the data set released. For each task, a reference algorithm that was used within MediaEval 2010 is presented, along with comments on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that users assign to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.

Irish Universities

DCU Online Research Access Service

Privacy-preserving Representation Learning for Speech Understanding

Author: Soleymani Mohammad
Tran Minh
Publication date: 26/10/2023

Existing privacy-preserving speech representation learning methods target a single application domain. In this paper, we present a novel framework to anonymize utterance-level speech embeddings generated by pre-trained encoders and show its effectiveness for a range of speech classification tasks. Specifically, given the representations from a pre-trained encoder, we train a Transformer to estimate the representations for the same utterances spoken by other speakers. During inference, the extracted representations can be converted into different identities to preserve privacy. We compare the results with the voice anonymization baselines from the VoicePrivacy 2022 challenge. We evaluate our framework on speaker identification for privacy and emotion recognition, depression classification, and intent classification for utility. Our method outperforms the baselines on privacy and utility in paralinguistic tasks and achieves comparable performance for intent classification.
Comment: INTERSPEECH 202

arXiv.org e-Print Archive

A Speech Representation Anonymization Framework via Selective Noise Perturbation

Author: Soleymani Mohammad
Tran Minh
Publication date: 27/10/2022

Privacy and security are major concerns when communicating speech signals to cloud services such as automatic speech recognition (ASR) and speech emotion recognition (SER). Existing solutions for speech anonymization mainly focus on voice conversion or voice modification to convert a raw utterance into another one with similar content but different, or no, identity-related information. However, the alternative of sharing speech data in the form of privacy-preserving representations has been largely under-explored. In this paper, we propose a speech anonymization framework that achieves privacy by applying noise perturbation to a selected subset of the high-utility representations extracted using a pre-trained speech encoder. The subset is chosen with a Transformer-based privacy-risk saliency estimator. We validate our framework on four tasks, namely Automatic Speaker Verification (ASV), ASR, SER, and Intent Classification (IC), for privacy and utility assessment. Experimental results show that our approach achieves competitive, or even better, utility compared to the speech anonymization baselines from the VoicePrivacy 2022 Challenge while providing the same level of privacy. Moreover, the easily controlled amount of perturbation gives our framework a flexible range of privacy-utility trade-offs without re-training any component.

arXiv.org e-Print Archive

Best of Affective Computing and Intelligent Interaction 2013 in Multimodal Interactions

Author: Pun Thierry
Soleymani Mohammad
Publication date: 01/03/2015

University of Twente Research Information